SEQUENCE LISTING

1 (1) GENERAL INFORMATION: 

2 (i) APPLICANT: Boodhoo, Amechand 

3 Seehra, Jasbir 

4 Shaw, Gray

5 Sako, Dianne 

6 (ii) TITLE OF INVENTION: HIGHLY PURIFIED MOCARHAGIN, A COBRA VENOM 

7 PROTEASE, POLYNUCLEOTIDES ENCODING SAME AND PROTEASES, AND

8 THERAPEUTIC USES THEREOF 

9 (iii) NUMBER OF SEQUENCES: 22 

10 (iv) CORRESPONDENCE ADDRESS: 

11 (A) ADDRESSEE: Genetics Institute, Inc. 

12 (B) STREET: 87 CambridgePark Drive 

13 (C) CITY: Cambridge 

14 (D) STATE: Massachusetts 

15 (E) COUNTRY: USA 

16 (F) ZIP: 02140 

17 (v) COMPUTER READABLE FORM: 

18 (A) MEDIUM TYPE: Floppy disk 

19 (B) COMPUTER: IBM PC compatible 

20 (C) OPERATING SYSTEM: PC-DOS/MS-DOS 

21 (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

22 (vi) CURRENT APPLICATION DATA: 

C--> 23 (A) APPLICATION NUMBER: US/09/996,620 

C--> 24 (B) FILING DATE: 27-NOV-2001 

25 (C) CLASSIFICATION: 

26 (vii) PRIOR APPLICATION DATA: 

27 (A) APPLICATION NUMBER: 09/026,001 

28 (B) FILING DATE: 18-FEB-1998 

29 (viii) ATTORNEY/AGENT INFORMATION:

30 (A) NAME: Brown, Scott A. 

31 (B) REGISTRATION NUMBER: 32,724 

32 (C) REFERENCE/DOCKET NUMBER: GI5293B

33 (ix) TELECOMMUNICATION INFORMATION: 

34 (A) TELEPHONE: (617) 498-8224 

35 (B) TELEFAX: (617) 876-5851 

36 (2) INFORMATION FOR SEQ ID NO: 1: 

37 (i) SEQUENCE CHARACTERISTICS:

38 (A) LENGTH: 30 amino acids

39 (B) TYPE: amino acid

40 (C) STRANDEDNESS: single

41 (D) TOPOLOGY: linear 

42 (ii) MOLECULE TYPE: peptide 
43 (iii) HYPOTHETICAL: NO

44 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

45 Thr Asn Thr Pro Glu Gln Asp Arg Tyr Leu Gln Ala Lys Lys Tyr Ile

46 1 5 10 15 

47 Glu Phe Tyr Val Val Val Asp Asn Val Met Tyr Arg Lys Tyr 

48 20 25 30 

50 (2) INFORMATION FOR SEQ ID NO: 2: 

51 (i) SEQUENCE CHARACTERISTICS: 

52 (A) LENGTH: 48 amino acids 

53 (B) TYPE: amino acid 

54 (C) STRANDEDNESS: single

55 (D) TOPOLOGY: linear 

56 (ii) MOLECULE TYPE: peptide 

57 (iii) HYPOTHETICAL: NO 

58 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

59 Thr Asn Thr Pro Glu Gln Asp Arg Tyr Leu Gln Ala Lys Lys Tyr Ile

60 1 5 10 15 

61 Glu Phe Tyr Val Val Val Asp Asn Val Met Tyr Arg Lys Tyr Thr Gly 

62 20 25 30 

> 63 Lys Leu His Val Ile Thr Xaa Xaa Val Tyr Glu Met Asn Ala Leu Asn

64 35 40 45

66 (2) INFORMATION FOR SEQ ID NO: 3: 

67 (i) SEQUENCE CHARACTERISTICS: 

68 (A) LENGTH: 15 amino acids

69 (B) TYPE: amino acid

70 (C) STRANDEDNESS: single 

71 (D) TOPOLOGY: linear 

72 (ii) MOLECULE TYPE: peptide 

73 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

74 Glu Ala Thr Glu Tyr Glu Tyr Leu Asp Tyr Asp Phe Leu Pro Glu 

75 1 5 10 15 

77 (2) INFORMATION FOR SEQ ID NO: 4: 

78 (i) SEQUENCE CHARACTERISTICS: 

79 (A) LENGTH: 15 amino acids 

80 (B) TYPE: amino acid 

81 (C) STRANDEDNESS: single 

82 (D) TOPOLOGY: linear 

83 (ii) MOLECULE TYPE: peptide 

84 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

85 Gln Ala Thr Glu Tyr Glu Tyr Leu Asp Tyr Asp Phe Leu Pro Glu

86 1 5 10 15

88 (2) INFORMATION FOR SEQ ID NO: 5:

89 (i) SEQUENCE CHARACTERISTICS:

90 (A) LENGTH: 2050 base pairs 

91 (B) TYPE: nucleic acid 

92 (C) STRANDEDNESS: double 

93 (D) TOPOLOGY: linear 

94 (ii) MOLECULE TYPE: cDNA 

95 (ix) FEATURE:

96 (A) NAME/KEY: CDS

97 (B) LOCATION: 78..1940

98 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

99 AGTCAATAGG AGAAGAGCTC AGGTTGGCTT GGAAGCAGAA AGAGATTCCT GTCCACCACT 60 

100 CCAATCCAGG CTCCAAAATG ATCCAAGCTC TCTTGGTAGC TATATGCTTA GCGGTTTTTC 120 

101 CATATCAAGG GAGCTCTATA ATCCTGGAAT CCGGGAATGT TAATGATTAT GAAGTAGTGT 180 

102 ATCCACAAAA AGTGCCTGCA TTGTCCAAAG GAGGAGTTCA GAATCCTCAG CCAGAGACCA 240 

103 AGTATGAAGA TACAATGCAA TATGAATTTC ACGTGAACGG AGAGCCAGTG GTCCTTCACT 300

104 TAGAAAGAAA TAAAGGACTT TTTTCAGAAG ATTACACTGA AACTCATTAT GCCCCTGATG 360

105 GCAGAGAAAT TACAACAAGC TCTCCAGTTC AGGATCACTG CTATTATCAT GGTTACATTC 420

106 AGAATGAAGC TGACTCAAGT GCAGTCATCA GTGCATGTGA TGGCTTGAAA GGACATTTCA 480 

107 AGCATCAAGG GGAGACATAC TTTATTGAGC CCTTGGAGCT TTCTGACAGT GAAGCCCATG 540 

108 CAATATACAA AGATGAAAAT GTAGAAGAAG AGGAAGAGAT CCCCAAAATC TGTGGGGTTA 600 

109 CCCAGACTAC TTGGGAATCA GATGAGCCGA TTGAAAAGTC CTCTCAGTTA ACTAATACTC 660

110 CTGAACAAGA CAGGTACTTG CAGGCCAAAA AATACATCGA GTTTTACGTG GTTGTGGACA 720

111 ATGTAATGTA CMGRAAATAC ACCGGCAAGT TACATGTTAT AACAAGAAGA GTATATGAAA 780

112 TGGTCAACGC TTTAAATACG ATGTACAGAC GTTTGAATTT TCACATAGCA CTGATTGGCC 840

113 TAGAAATTTG GTCCAACGGA AATGAGATTA ATGTGCAATC AGACGTGCAG GCCACTTTGG 900 

114 ACTTATTTGG AGAATGGAGA GAAAATAAAT TGCTGCCACG CAAAAGGAAT GATAATGCTC 960 

115 AGTTACTCAC GAGCACTGAG TTCAATGGAA CTACTACAGG ACTTGGTTAC ATAGGCTCCC 1020

116 TCTGTAGTCC GAAGAAATCT GTGGCAGTTG TTCAGGATCA TAGCAAAAGC ACAAGCATGG 1080

117 TGGCAATTAC AATGGCCCAT CAGATGGGTC ATAATCTGGG CATGAATGAT GACAGAGCTT 1140

118 CCTGTACTTG TGGTTCTAAC AAATGCATTA TGTCTACAAA ATATTATGAA TCTCTTTCTG 1200

119 AGTTCAGCTC TTGTAGTGTC CAGGAACATC GGGAGTATCT TCTTAGAGAC AGACCACAAT 1260

120 GCATTCTCAA CAAACCCTCG CGCAAAGCTA TTGTTACACC TCCAGTTTGT GGAAATTACT 1320

121 TTGTGGAGCG GGGAGAAGAA TGTGACTGTG GCTCTCCTGA GGATTGTCAA AATACCTGCT 1380

122 GTGATGCTGC AACTTGTAAA CTGCAACATG AGGCACAGTG TGACTCTGGA GAGTGTTGTG 1440

123 AGAAATGCAA ATTTAAGGGA GCAGGAGCAG AATGCCGGGC AGCAAAGAAT GACTGTGACT 1500

124 TTCCTGAACT CTGCACTGGC CGATCTGCTA AGTGTCCCAA GGACAGCTTC CAGAGGAATG 1560

125 GACATCCATG CCAAAACAAC CAAGGTTACT GCTACAATGG GACATGTCCC ACCTTGACAA 1620

126 ACCAATGTGC TACTCTCTGG GGGCCAGGTG CAAAAATGTC TCCAGGTTTA TGTTTTATGT 1680

127 TGAACTGGAA TGCCCGAAGT TGTGGCTTGT GCAGAAAGGA AAATGGCAGA AAGATTCTAT 1740

128 GTGCAGCAAA GGATGTAAAG TGTGGCAGGT TATTTTGCAA AAAGAAAAAC TCGATGATAT 1800

129 GCCACTGCCC ACTCCATCAA AGGACCCAAA TTATGGAATG GTTGCACCTG GAACAAAATG 1860

130 TGGAGTTAAA AAGGTGTGCA GAAACAGGCA ATGTGTTAAA GTATAGACAG CCAACTGATC 1920

131 AAGCACTGCT TCTCTCAATT TGATTTTGGA GATCCTCCTT CCAGAAGGCT TTCCTCAAGT 1980

132 CCAAAGAGAC CCATCTGTCT TTATCCTACT AGTAAATCAC TCTTAGCTTT CAAAAAAAAA 2040

133 AAAAGTCGAC 2050 
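
The CDS feature recorded for SEQ ID NO: 5 (LOCATION: 78..1940) can be cross-checked against the listing with a little arithmetic. The following is only a minimal sketch in Python and is not part of the filed listing. The helper names `cds_codon_count` and `codon_at` are hypothetical, and the sketch assumes the location uses 1-based, inclusive coordinates, as is conventional for sequence-listing features. `HEAD` holds only the first two rows (bases 1 to 120) of the sequence as printed above.

```python
def cds_codon_count(location: str) -> int:
    """Number of codons spanned by a 'start..end' location (assumed 1-based, inclusive)."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS span {location!r} is not a whole number of codons")
    return length // 3


def codon_at(sequence: str, position: int) -> str:
    """Three bases starting at a 1-based position in the sequence."""
    return sequence[position - 1:position + 2]


# Bases 1-120 of SEQ ID NO: 5, copied from the first two rows of the listing;
# split() drops the spaces and line breaks between the ten-base groups.
HEAD = "".join("""
AGTCAATAGG AGAAGAGCTC AGGTTGGCTT GGAAGCAGAA AGAGATTCCT GTCCACCACT
CCAATCCAGG CTCCAAAATG ATCCAAGCTC TCTTGGTAGC TATATGCTTA GCGGTTTTTC
""".split())

if __name__ == "__main__":
    print(cds_codon_count("78..1940"))  # codons covered by the CDS feature
    print(codon_at(HEAD, 78))           # codon at the stated CDS start
```

Under these assumptions the span 78..1940 covers 1863 bases, or 621 codons, which agrees with the 621 amino acids stated for SEQ ID NO: 6, and the codon at position 78 is ATG.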
135 (2) INFORMATION FOR SEQ ID NO: 6:

136 (i) SEQUENCE CHARACTERISTICS:

137 (A) LENGTH: 621 amino acids 

138 (B) TYPE: amino acid 

139 (C) STRANDEDNESS:

140 (D) TOPOLOGY: linear 

141 (ii) MOLECULE TYPE: protein 

142 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

143 Met Ile Gln Ala Leu Leu Val Ala Ile Cys Leu Ala Val Phe Pro Tyr

144 1 5 10 15

145 Gln Gly Ser Ser Ile Ile Leu Glu Ser Gly Asn Val Asn Asp Tyr Glu
222 (2) INFORMATION FOR SEQ ID NO: 7:

223 (i) SEQUENCE CHARACTERISTICS: 

224 (A) LENGTH: 2297 base pairs

225 (B) TYPE: nucleic acid

226 (C) STRANDEDNESS: double

227 (D) TOPOLOGY: linear 

228 (ii) MOLECULE TYPE: cDNA 

229 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



232 GTTTTTCCAT ATCAAGGGAG CTCTATAATC 

233 GTTGTGTATC CACAAAAAGT GCCTGCATTG 

234 GAGACCAAGT ATGAAGATAC AATGCAATAT 

235 CTTCACTTAG AAAGAAATAA AGGACTTTTT 

236 CCTGATGGCA GAGAAATTAC AACAAGCCCT 

237 TACATTCAGA ATGAAGCTGA CTCAAGTGCA 

238 CATTTCAAGC ATCAAGGGGA GACATACTTT 



244 ATTGGCCTAG AAATTTGGTC CAACCATGAT AAGTTTGAAG 